How 4 genAI tools stack up


Results: Claude, NotebookLM, and ChatGPT answered with str_squish(), which I consider the correct answer. Perplexity assumed I only cared about whitespace at the beginning and end of the text and not in the middle. After a follow-up question, it also found the best function.
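In case you haven't used it, str_squish() comes from the stringr R package. Unlike str_trim(), which removes whitespace only at the ends of a string, it also collapses repeated whitespace in the middle, which is what my question called for:

    library(stringr)

    messy <- "  too   many    spaces  "
    str_trim(messy)   # "too   many    spaces" (trims the ends only)
    str_squish(messy) # "too many spaces" (also collapses interior runs)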

Test 2: Somewhat vague search of my social media posts

This was a more difficult task, but one much closer to what people might want help with in the real world.

Question: “I really liked an article about LLMs written by Lucas Mearian at Computerworld. Please tell me the specifics based on my LinkedIn posts that I uploaded.”

Info source: 2 years of my LinkedIn posts.

Results: NotebookLM and Claude nailed their responses, each offering two options including the one I wanted. ChatGPT gave me somewhat related articles, but not the one I wanted. (I’d been looking for “What are LLMs, and how are they used in generative AI?”)

Perplexity with its default Auto LLM didn’t give me anything useful, claiming “there is no specific mention of an article about Large Language Models (LLMs) written by Lucas Mearian.”

Test 3: Find a US Census table ID for a specific topic

A lot of businesses use the US Census Bureau’s American Community Survey (ACS) for demographic information. With thousands of available data variables, it can be hard to find one that has the information you want. This type of query could represent a lot of other data lookups businesses might want to do with their own data.

Question: “What is the best variable to use to find information about the percent of workers who work from home?”

Info source: I downloaded and filtered several listings of ACS table variable IDs (filtered because a couple of the lists were too large), along with a general explanation of ACS tables from the Census Bureau website. Since some of these platforms don’t accept CSV files in projects, I saved the variable data as tab-delimited .txt files.
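For what it's worth, that conversion step is a two-liner in R with the readr package. Here's a quick sketch, with hypothetical file names:

    library(readr)

    # Read a variable listing saved as CSV and rewrite it tab-delimited
    read_csv("acs_variable_list.csv") |>
      write_tsv("acs_variable_list.txt")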

Expected response: Kyle Walker, director of the Center for Urban Studies at Texas Christian University and author of the tidycensus R package, used the DP03_0024P variable in one of his examples, so that’s what I was expecting in a correct answer.
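If you want to check that variable yourself, here's a minimal sketch using the tidycensus package mentioned above. It assumes you've already stored a Census API key with census_api_key(), and the year is illustrative:

    library(tidycensus)
    library(dplyr)

    # Search the ACS Data Profile variable labels for work-from-home measures
    vars <- load_variables(2023, "acs5/profile")
    filter(vars, grepl("worked from home", label, ignore.case = TRUE))

    # DP03_0024P: percent of workers 16 and over who worked from home, by state
    wfh <- get_acs(geography = "state", variables = "DP03_0024P", year = 2023)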

Results: NotebookLM, ChatGPT, and Perplexity all gave me results I could use. (Unexpectedly, I learned that there is more than one correct answer — ChatGPT and Perplexity both found other variables that include the percent of people working from home.)

Claude couldn’t compete on this one, since my three .txt files with data totaling less than 800KB exceeded its “project knowledge” limit.

Test 4: Ask about professional conferences

This test featured two questions against two different data sources: first, ask about a conference that might fit my needs; then, ask about the sessions at one specific conference.

Question 1: “I’m looking for IDG events that will talk about artificial intelligence. I’d like them to be within a 2-hour flight or so from Boston.”

Info source: The IDG global events calendar PDF.

Expected result: The most complete correct answer would cite FutureIT New York in July and FutureIT Toronto on April 30 – May 1. Work+ in Nashville, a 2-hour-50-minute flight away, would also be a reasonable suggestion.

Results: ChatGPT nailed it with both its more advanced o3-mini-high model and its general 4o LLM, returning the two events that exactly match the criteria.

Perplexity’s Sonar LLM returned both events as well as the CIO100 conference in Arizona, while acknowledging that the Arizona event is beyond a 2-hour flight.

NotebookLM got it partially right, suggesting FutureIT New York and Work+ in Nashville (which it accurately said was “reasonably close to Boston” — true, it’s less than 3 hours away). However, it missed Toronto.

Claude with its older Sonnet 3.5 model returned both matching events, along with “UK Events for reference, though outside your travel range,” but did not include Nashville. Claude with its newer Sonnet 3.7 model at its default setting did worse, finding only one matching event, a couple of others in the US, and two in Europe (noting that those were outside the travel range). When I changed Sonnet 3.7 from its default to “extended” reasoning, it gave a better response: both the New York and Toronto events, as well as a virtual event.

Question 2: “Tell me all the sessions at the NICAR conference for people who are already proficient in spreadsheets — that is, they are not beginners, but they want to improve their spreadsheet skills.”

Info source: Text file of the full NICAR data journalism conference schedule.

Results: NotebookLM gave me more than a dozen interesting suggestions involving Google Sheets, Excel, and Airtable, with only one that might not have been relevant. It was definitely more than I would have found by simply searching the conference web page for “Excel” and “Sheets.” Plus, because I could click to zoom into the exact schedule text it cited, it was easy to check for hallucinations.

Brainstorming is one area where many experts say LLMs can shine. I plan to upload other conference schedules to NotebookLM in the future to make sure I don’t overlook potentially useful sessions.

ChatGPT also came up with a dozen or more sessions that could be of interest, arranged by date and time and more nicely formatted than NotebookLM’s response. Claude proposed slightly fewer, but all seemed to match.

Perplexity was disappointing, claiming: “While the provided information does not explicitly list sessions for those proficient in spreadsheets, several sessions at the NICAR 2025 conference could be beneficial for improving spreadsheet skills or learning advanced data analysis techniques.” It suggested only three.

Recommendations

Generative AI cloud services can be a helpful, no-code way to answer questions about your own information — both finding info you know exists and helping you discover new insights.

If you want a platform that’s easy, free, and cites sources so you can check for hallucinations, Google’s NotebookLM is an excellent choice.

If you already subscribe to ChatGPT, its projects are worth a test. They’re set up to handle a wider range of requests than simply Q&A, and ChatGPT’s responses are often better formatted and easier to read than NotebookLM’s. If you’re a free user, you can upload files to conventional ChatGPT chats and get similar capabilities.

Claude may be a good option if you don’t have large amounts of data per project and you’re already subscribing, especially if you want it to answer questions about data in a GitHub repository. If one response is unsatisfactory, try changing model settings.

I found Perplexity to be more compelling for answering questions about information on the web, especially for use cases like software help where the info is spread over a lot of different files within a domain such as slack.com/help. However, I’d probably go with NotebookLM or ChatGPT for local data.

